Acoustic correlates of meaning structure in conversational speech
Abstract
We are interested in the problem of extracting meaning structures from spoken utterances in human communication. In Spoken Language Understanding (SLU) systems, meaning structures are parsed from the word hypotheses generated by the Automatic Speech Recognizer (ASR). This approach suffers from high word error rates and ad hoc conceptual representations. In contrast, in this paper we aim to discover meaning components from direct measurements of acoustic and non-verbal linguistic features. The meaning structures are taken from the frame semantics model proposed in FrameNet, a consistent and extensible semantic structure resource covering a large set of domains. We give a quantitative analysis of meaning structures in terms of speech features across human–human dialogs from the manually annotated LUNA corpus. We show that the correlations between acoustic features (pitch, formant trajectories, intensity and harmonicity) and meaning features are statistically significant over the whole corpus and are also relevant for classifying the target words evoked by a semantic frame.
Similar resources
Prosody and phonetic variability: Lessons learned from acoustic model clustering
Most research on the use of prosody in automatic speech processing has focused on F0, energy and duration correlates to prosodic structure. However, there are multiple sources of evidence suggesting that there are spectral correlates as well. This paper presents an analysis of prosodically labeled conversational speech data using acoustic parameters and clustering techniques that are standard i...
Acoustic and articulatory correlates of speaking condition in blind and sighted speakers
Compared to conversational speech, clear speech is produced with longer vowel duration, greater intensity, increased contrasts between vowel categories, and decreased dispersion within vowel categories. Those acoustic correlates are produced by larger movements of the orofacial articulators, including visible (lips) and invisible (tongue) articulators. How are those cues produced by visually im...
Investigating the formal effect of rear wall structure on acoustic parameters of speech halls (Research Article)
The rear wall of a hall is the element furthest from the sound source; the reflections from this structural member therefore play an important role in music and speech intelligibility, especially for the rear third of the audience. Hence the form of these structures can strongly affect the acoustical quality of speech halls and auditoria. In this study, four formal structures are ex...
Designing a phoneme recognition algorithm using the acoustic correlates of phonological features
In the present paper, the phonological feature geometry of the Persian phonemes is analyzed in terms of articulator-free and articulator-bound features, based on the articulator model of nonlinear phonology. Then, the reference phonetic pattern of each feature that consists of one or a set of acoustic correlates, characterized by the quantitative or qualitative values in its phonological re...
Speaking Clearly for the Blind: Acoustic and Articulatory Correlates of Speaking Conditions in Sighted and Congenitally Blind Speakers
Compared to conversational speech, clear speech is produced with longer vowel duration, greater intensity, increased contrasts between vowel categories, and decreased dispersion within vowel categories. Those acoustic correlates are produced by larger movements of the orofacial articulators, including visible (lips) and invisible (tongue) articulators. Thus, clear speech provides the listener w...